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Abstract 

In this paper second order semi-Markov reward models are presented and 
equations for the higher order moments of the reward process are presented 
for the first time and applied to wind energy production. A real application 
is executed by considering a database, freely available from the web, in which 
are included wind speed data taken from L.S.I. - Lastem station (Italy) and 
sampled every 10 minutes. We compute the expected total energy produced 
by using the blade Aircon HAWT - 10 kW. 

Keywords: semi-Markov chains, synthetic time series, autocorrelation 



1. Introduction 

Nowadays the subject of wind speed modeling is becoming increasingly 
important. Wind power generation companies are intensely interested in 
quantifying the energy which can be extracted from wind by using different 
types of wind turbines in a given location. 

Models to predict future wind speed and power production have been pro- 
posed extensively in last two decades. In general the wind speed is divided 
into a finite number of states. Transitions between states occur randomly in 
time. The installed blade produces energy depending on its technical char- 
acteristic and on the wind speed status. 

Markov chains have been extensively used to model the behavior of wind 
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speed data. Several authors have discussed the use of techniques of Markov 
processes in wind speed modeling, see e.g., Shamshad et al. (2005), Nfaoui 
et al. (2004) and Youcef et al. (2003). The Markovian assumption has, espe- 
cially in the modeling of wind speed, several flaws. In discrete time, waiting 
times in a state before making a transition to another state are geometrically 
distributed and consequently the memoryless property applies. This leads 
to a great simplification of the model which is unable to reproduce correctly 
the statistical properties of the real wind speed process. 

Semi-Markov chains do not have this constraint, because the waiting time 
distribution function in the states can be of any type and this allow the data 
to speak for themselves without any restriction. D'Amico et al. (2011) was 
the first paper were semi-Markov chains were applied in the modeling of 
wind speed. In that paper first and second order semi-Markov models were 
proposed with the aim of generate reliable synthetic wind speed data. It 
was shown that all the semi-Markov models perform better than the Markov 
chain model in reproducing the statistical properties of wind speed data. In 
particular, the model recognized as being the more suitable is the second 
order semi-Markov model in state and duration. 

One of the purposes of this paper is to provides methods for computing 
the accumulated energy produced by a blade in a temporal interval [0, T]. 
To this end we introduce semi-Markov reward processes. 

Semi-Markov reward processes were applied in several domains, for exam- 
ple De Dominicis and Manca (1986) applied non-homogeneous semi-Markov 
reward processes to insurance disability problems. In Stenberg et al. (2007) 
backward semi-Markov reward processes were considered for calculating any 
integer moment of the reward process. 

In this paper we give a generalization of these results by defining the 
second order semi-Markov reward process in state and duration and giving 
relations for computing the higher order moments of this process. We apply 
the theoretical results in computing the accumulated energy produced by a 
blade during a bounded time interval. The expected total energy produced 
gives important information on the feasibility of the investment in a wind 
farm and the riskiness of the investment can be measured in terms of vari- 
ance, skewness, and kurtosis of the reward process. Finally notice that the 
technological characteristics of different blades are captured by the perma- 
nence reward and consequently we are able to choose among different blades 
to be installed at a given location. Additionally, we propose to employ a 
matrix notation that makes calculations easier and also provides a compact 
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form for equations of moments of the reward process. 

2. The second order semi-Markov chain in state and duration 

In this section we describe briefly the second order semi-Markov chain in 
state and duration, see D'Amico et al. (2012) for additional results. 

Let consider a finite set of states E = {1,2,. ..,5} in which the system 
can be into and a complete probability space (O, F, P) on which we define 
the following random variables: 

J n :Q^E, T n : Q ->■ IN. (1) 

They denote the state occupied at the n-th transition and the time of the 
n-th transition, respectively. To be more concrete, by J n we denote the wind 
speed at the n-th transition and by T n the time of the n-th transition of the 
wind speed. 

We assume that 

P[^n+i = j, T n+1 -T n = t\a{J s , T s ), J n = k, J n _i =i,T n - T„_i = x, < s < n] 
= ^[Jn+i = ji T n +\ — T n = t\J n = k, J n -\ = i,T n — T n _i = x] := xQi.k,j{t)- 

(2) 

Relation (2) asserts that, the knowledge of the values J n , J n _i,T„ — T n _i 
suffices to give the conditional distribution of the couple J n+ x,T n+1 — T n 
whatever the values of the past variables might be. Therefore to make prob- 
abilistic forecasting we need the knowledge of the last two visited state and 
the duration time of the transition between them. For this reason we called 
this model a second order semi-Markov chains in state and duration. 

It should be remarked that in the paper by Limnios and Opri§an (2003) 
were defined nth order semi-Markov chains in continuous time. Anyway the 
dependence was only on past states and not on durations. 

The conditional probabilities 

xQi.k,j{i) = ^[Jn+i = ji T n -\-\ T n = t\J n = fc, J n —i = i, T n T n -\ = %] 

are stored in a matrix of functions q = ( x Qi.k,j(t)) named the second order 
kernel (in state and duration). The element x qi.kj{t) represents the probabil- 
ity that next wind speed will be in speed j at time t given that the current 
wind speed is k and the previous wind speed state was % and the duration in 
wind speed i before of reaching wind speed k was equal to x units of time. 



3 



From the knowledge of the kernel we can define the cumulated second 
order kernel probabilities: 

xQi.k,j{t) '■— P[Jn+i = j>T n+ i — T n < t\J n — k, J n -i = i,T n — T n -\ = x] 
t 

s=l 

(3) 

The process {</„} is a second order Markov chain with state space E 
and transition probability matrix X P = x Q(oo). We shall refer to it as the 
embedded Markov chain. 

Define the unconditional waiting time distribution function in states k 
coming from state % with duration x as 

xHi.k{t) := P[T ra+1 — T n < t\J n = k, J n _i = i,T n — T n _\ — x\ — ^ ] x Qi.k,j{t)- 

jeE 

(4) 

The conditional cumulative distribution functions of the waiting time in 
each state, given the state subsequently occupied is defined as 

xGi.kjiy) = Ppn+i — T n < t\J n = k, J n -\ = i, J n +\ = j, T n — T n _i = x\ 

xPi.kj s=1 

Define by iV(£) = sup{n : T n < t} Vt G IN. We define the second 
order (in state and duration) semi-Markov chain as Z(t) = (Z 1 (t) , Z 2 (£)) = 

(JjV(i)-l, JN(t)))- 

For this model ordinary transition probability functions and transition 
probabilities with initial and final backward recurrence times were defined 
and computed D'Amico et al. (2012). 

3. The Reward Model 

In this section by following the line of research in Stenberg et al. (2007) 
we determine recursive equations for higher order moments of the second 
order semi-Markov reward chain in state and duration. 

Let £(t) denote the accumulated discounted reward during the time in- 
terval (0,t] defined by the following relation, 

J2 ^( ll )- 2 V'J J v (u) _ 2 ,J i v (ll) _ 1 ;J J v (ll) (5(M))e- 5M (6) 
0<u<t 
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where: B{u) = u — T/v(«) is the backward recurrence time process, X N ^ := 
Tn( u )+i — Tn( u ) is the sojourn time in state Jn(u) before the N(u) + 1 tran- 
sition and e~ 6 with 5 G [0, 1] is a one period deterministic discount factor. 

The reward v„, . „ibj. T , , „ /„, , ABiuX) is more general than those 
considered in Stenberg et al. (2007) because at the same time it is state de- 
pendent on the current state Jn(u) of the system; it depends on the last two 
visited states Jjv(«)-2, Jn(u)-i] it is duration dependent in the current state 
because it is a function of the backward process B(u) and finally it depends 
on the sojourn time in the past states being dependent on X N ^_ 2 - 

Let also denote by x ,v£i,k(t) the random variable, which has the distribu- 
tion the same with the conditional distribution for the random variable £(t) 
given that 

Jn(o)-i = k, Jn(o)-2 = i, B(T N (pyi) = v, X N ( y 2 = x 

and let denote by x , v V$\t) := E[( X:V ^ k (t)) n \. 

In order to propose a matrix notation that simplifies calculations and 
provides a compact form for next equations we need to introduce the adopted 
notation and products. 

Given two m x n matrices A and B, their Hadamard matrix product M 
gives the m x n matrix C whose generic element is given by: 

C{j ctij bij . 

Let A be a m 2 x m matrix and B be a m 2 column vector, their <g> matrix 
product gives the m 2 column vector whose elements, for all i, k e {1, 2, m) 
are expressed by 

m 

C(i-l)-\m\+k = y^^(i-l)-|m|+fc,j-B(fc-l)-|m|+j,l- 
J'=l 

The first order moment of the reward process x ,v£i,k(t) is computed in the 
following Theorem. 

Theorem 1. The first order moment of the second order semi-Markov chain 
in state and duration satisfies the following matrix equation: 

t 

x ,vV {1) (t) = XtV D(t) ® x> Mt) + 5ZU B (s) • 1|E|) ^ 

(7) 

s=l 
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where Vi,k e E 

x V(v + u) = ( x V(i-i)-|s|+fc( u + ^)e _5u ) = ( x ^i,fc ifc (u + n)e~ 5u ) , 
s *(u + u) |£| 2 x 1 and X)W *(t) = Y? u =i *^( v + M ) * s \ E \ 2 x ^ 

XjV D(i) = ( x>t ,-D(i-i).|£;|+fc(i)) = ( x ,„A,fc(0) = ( ^3 ^ / n " 

^b( s ) = U^-iW*)) = (rr^^y) 

and 

1| B | T = [1,1,...,1] T . 

Proof: Let consider the random variable x ,v£i,k(t)- The time of next transi- 
tion Tat( )+i can be greater of t or not. Consequently it results that: 

*,vV${t) '■= E[ XjV ^ k (t)] = E[ x ^ iyk (t)l {TN(0)+1>t} ] + E[ XjV £ itk (t)l{ TN{0)+1 < t y\. 

(8) 

In the case Tjv(o)+i > t we have that 

t 

z,v£i,k(t) = ^ ^i,k-,k(u + v)e~ Su (9) 

u=l 

and this event occurs with probability 

P(7jv(o)+i > *|7jv(o)+i > 0,Tat( ) = —v, Jn(o) = fc, 7jv(o)-i = — n — Jjv(o)-i = «) 
_ P (7jv(o)+i > £)7V(o)+i > 0, Tjv(o) = — n|Jiv(o) = k,T N ( )-i = i,^N(o)-i = x) 

P (^JV(0)+1 > Oj^JV(O) = — ^I-^V(O) = fc, ^JV(0)-1 = ^^iV(0)-l = 
_ P (Xjv(O) > * + H>Mo) = fc,Tjv(0)-l = i,^AT(0)-l = 
IP (-XjV(O) > ^I^AT(O) = fc, ^JV(0)-1 = -X"iV(0)-l = ^) 

1 - x H itk (t + v) 



, , , x,vDi ; k(t)- 
1 - x-tli,k{V) 



(10) 
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Then it results that 

t 

E U,w&,k(*)l{T w(0)+1 >t}] =x,v D itk (t) ^ x4>i,k;k( U + v)e~ &U . (11) 



u=l 



The right hand side of (11) can be expressed in matrix form as follows: 

XjV B(t) m *,„*(t). (12) 

In the second case, when Tjv(o)+i < t, if we consider the next visited state 
Jn(o)+i an d the time of next transition Tjv(o)+i we have: 



t n(o)+i 



-<5T]v(o)+i 



x I ^i,fc(*)=j Yl ^k-Av + s')e Sa ' + v+T N{0)+1 ,otk,j N(0)+1 (t - s)e~ 

y s'=i 

(13) 

The event {Jtv(o)+i = j, ^jv(o)+i — s } occurs with probability 

P (Jjv(o)+i = 3,Tn(o)+i = s|Tjv( 0)+1 > 0,Ttv(o) = — u, <//v(o) = fc 3 ^V(o)-i = —v — x, Jn(o)-i = i) 
_ P (</jv(o)+i = 3,Tn(o)+i = s,Tjv( ) +1 > 0,T/v( ) = — u| J^o) = fc)7jv(o)-i = z>^Gv(o)-i = a;) 

P (^7V(0)+l > 0,Tjv(o) = — ^|^JV(0) = fe, 7jV(0)-l = «)-XjV(0)-l = x) 
_ P (</jV(0)+l = J) -^AT(O) = S + u| JjV(O) = ?JV(0)-1 = -XjV(0)-l = 
P (^iV(O) > W| Jjv(O) — fcj^JV(0)-l — «>-XiV(0)-l — ^) 

" 1 - ^) " ^ WW - 

(14) 

Notice that the random variable ?;+T JV(0)+1 ,o^A;,j iV{0)+1 (^ — s) is independent 
of the distribution of the joint random variable (jjv(o)+i> ^V(o)+i) because 
the accumulation process has the Markov property at transition times and 
consequently once the state Jat(o)+i and the T/v(o)+i are known its behaviour 
doesn't depends on the distribution of ( Jjv(o)+i, 7jv(o)+i)- Then by taking the 
expectation in (13) we get 



E[ x , l )Ci,fc(t)l{T i 



AT(0) + 1<*} 



j£-B s=l ' v ' s'=l 



(15) 
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The right hand side of (15) can be expressed in matrix form as follows: 



t 



(16) 

+ J2 e ~ SS X+ s, V {1) (t- S ). 

3=1 

A substitution of (12) and (16) in (8) concludes the proof. 

By using similar techniques it is possible to get recursive equations for 
the higher order moments of the reward process. 
First note that 

.,^f(t):=E[(, |tl ^W) B ] (17) 

= ^[(x, 1 ,^,fc(^)) n l{T JV(0)+1 >t}] + ^[(x,^,fcW) n l{T iv(0)+1 <t}]- 

In the case T/v(o)+i > t we have that 

t 

UvZi,k(t)T = (Yl ^Mk{u + v)e~ 8u Y (18) 

u=l 

and this event occurs with probability x , v D itk (t), see (10). Consequently it 
results that 

t 

Ek^(t)MT N(0)+1 >t } ] =»,„ Aj(*)(X) *V*A*(« + «)*-**)" (19) 

u=l 

In the opposite case, when Tjv(o)+i < ^ we have that 

( T JV(0) + 1 
'^i,k-,k(v + s')e- 5s ' + v+T N(0)+1 fi£k,j N(0)+1 (t - s)e 
s'=l 

T N(0) + 1 

( £ ^Mv + s')e- Ss y+ v+TN(0)+lfi ^^ 



s'=l 

„,../. \ r»-l 



+ E ( I ) (E AmC + U (0)+1 ,A-' (t - S)e " BT " m+ ')- 

(20) 



The event {J N (o)+i = j, ^V(o)+i = $} occurs with probability XiV B( i -. 1 y\ E \ +k j(s). 

Then, by using the already mentioned independence between t) +r JV(0)+1 ,o^fc,j JV(0)+1 it— 
s) and the joint random variable (Jjv(q)+i> ^/v(o)+i) , by taking the expectation 
in (20) we get: 

Ek,»d?(«)i{T„,„ )+ ,< t >] = 

j&E s=l x t>lty 1 \s'=l / 

jg_E s=l ' v ' 

■^(t-.Je-"'. 

If we substitute (19) and (21) in (17) and we represent the resulting 
expression in matrix form we obtain the following equation: 

t 

*,vVW(t) = x , v D(t) B x , v ¥ n \t) + {x,vB(s) ■ 1\ E \) B x , v ¥ n \s) 

8=1 

t 

+ J2 e ~ Ssn ^B(s) ® „ +fl , V (n) (t - s) (22) 

8=1 

t n—l y \ 

( ? ) ^* (n ~°(*) H ® - «)) . 

8 = 1 1=1 ^ ' 

It should be remarked that if Vz £ J3 and Vi 6 IN we have that 

xqi.kjit) = q k j(t), x ipi. k j(yt) = ipkj(t) 

then the second order semi-Markov reward chain model in state and duration 
collapses in a standard semi-Markov reward chain model and we recover 
exactly the results by Stenberg et al. (2007). 

4. Application to real data 

To check the validity of our model we perform a comparison of the behav- 
ior of real data and wind speeds generated through Monte Carlo simulations 
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Figure 1: Time series of wind speed and its empirical distribution. 



based on the models described above. In this section we describe the database 
of real data used for the analysis, the method used to simulate synthetic wind 
speed time series and, at the end, we compare results from real and simulated 
data. 

The data used in this analysis are freely available from http : / /www.lsi — 
lastem.it/meteo/page/dwnldata.aspx. The station of L.S.I. -Lastem is sit- 
uated in Italy at N 45 28' 14,9" - E 9 22' 19,9" and at 107 m of altitude. 
The station use a combined speed-direction anemometer at 22 m above the 
ground. It has a measurement range that goes from to 60 m/s, a threshold 
of 0,38 m/s and a resolution of 0,05 m/s. The station processes the speed 
every 10 minute in a time interval ranging from 25/10/2006 to 28/06/2011. 
During the 10 minutes are performed 31 sampling which are then averaged 
in the time interval. In this work, we use the sampled data that represents 
the average of the modulus of the wind speed (m/s) without a considered 
specific direction. The database is then composed of about 230thousands 
wind speed measures ranging from to 16 m/s. The time series, together 
with its empirical probability density distribution are represented in Figure 
1. The lower panel of the same figure shows that the wind speed follows a 
Weibul distribution. 



10 



To be able to model the wind speed as a semi-Markov chain the state 
space of wind speed has been discretized. In the example shown in this 
work we discretized wind speed into 8 states chosen to cover all the wind 
speed distribution. The state space is numerically represented by the set 
E = {0 - 1, 1 - 2, 2 - 3, 3 - 4, 4 - 5, 5 - 6, 6 - 7, > 7}(m/s). From the 
discretized wind speeds we estimated the probabilities P and G to generate 
synthetic trajectories for a second order semi-Markov model in state and 
duration. 

We give here the Monte Carlo algorithm used to simulate the trajectory 
in the time interval [0, T] where T is equal to the time length of the real 
data. The output of the algorithm consists in the successive visited states 
{Jo, J\i •••} and the jump times {T ,Xi, ...} up to the time T. 
Denote by X n = T n+1 - T n . 

1 Set n = 0, J_x = i, J = k, X_i = x, horizon time = T; 

2 Sample J from x n -xPj n -xJ n {-) and set J n+1 = J(u); 

3 Sample W from x n _ x G Jn _ uJn ,j n+1 {-) and set T n+1 = T n + W(u); 

4 If T n+ i > T stop, else set X n = T n+ i — T n and n = n + 1 and go to 2. 

We show in Figure 2 a short sample of 160 hours of the real and simulated 
trajectories just for comparison reason. We apply our model to a real case of 
energy production. For this reason we choose a commercial wind turbine, a 
10 kW Aricon HAWT with a power curve given in Figure 3. The power curve 
of a wind turbine represents how it produces energy as a function of the wind 
speed. In this case we have no production of energy in the interval 0-2 m/s, 
the wind turbine produces energy linearly from 3 m/s to 10 m/s, than, with 
increasing wind speed the production remain constant until the limit of wind 
speed in which the wind turbine is stopped for structural reason. Through 
this power curve we transform the wind speed of our real and synthetic data 
into the equivalent energy produced at each different state of wind speed, 
in order to validate our model in a real case of energy production. For this 
reason, to check the proper functioning of the proposed model, we do several 
comparisons between real and synthetic data. In Figure 4 is showed the daily 
energy production of real and synthetic data. Another comparison between 
real and synthetic data is made by the cumulated energy produced in a time 
interval. In Figure 5 are showed these values as a function of time. The 
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Figure 3: Power curve of the 10 kW Aricon HAWT wind turbine. 
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Figure 4: Comparison of the daily energy production for the real and simulated data. The 
error bar represent one standard deviation. 
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Figure 5: Comparison between the cumulated energy produced by real and simulated data. 

two dashed line represents the standard deviation of the synthetic data. It 
is interesting to note that the trend of the real cumulated energy produced 
remains almost inside the area formed by the two dashed line. We generate 
500 different trajectories and for each one we calculate the cumulated energy 
for all the length of the time series. In Figure 6 is plotted the distribution 
of the cumulated energy produced by the different trajectories. The black 
point represents the energy produced by the real data. The figure shows that 
simulated and real data give the same value for produced energy in the time 
interval within statistical error. 
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Figure 6: Distribution of the cumulated energy produced by the real and simulated tra- 
jectories. 

5. Conclusion 

The wind is a very unstable phenomenon characterized by a sequence of 
lulls and sustained speeds, and a good wind generator must be able to repro- 
duce such sequences. We have then modeled wind speed through a second 
order semi-Markov model. Our work is motivated by the presence of persis- 
tence in the wind speed process. The purposes of this paper is to provides 
theoretical methods for computing the accumulated energy produced by a 
blade in a temporal interval [0, Tj. We have shown, by means of Monte Carlo 
simulations, that the proposed model is able to reproduce well the behavior 
of real data as far as energy production from wind speed is concerned. 
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